Filtering Syntactic Constraints for Statistical Machine Translation
Abstract
Source-language parse trees offer very useful but imperfect reordering constraints for statistical machine translation. Much effort has gone into applying syntactic constraints softly. We propose instead to use syntactic constraints selectively: a classifier is built automatically to decide whether each node in the parse tree should be used as a reordering constraint. Using this information yields a 0.8 BLEU point improvement over a full constraint-based system.
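The selective idea in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: a reordering constraint is read here as "a translated phrase span may not cross the boundary of a constraint node", and `toy_keep` is a hypothetical stand-in for the trained classifier, whose real features and model the abstract does not describe. The names `Node`, `select_constraints`, `crosses`, and `allowed` are likewise illustrative only.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Half-open [start, end) span over source word positions.
Span = Tuple[int, int]


@dataclass
class Node:
    label: str
    span: Span


def select_constraints(nodes: List[Node], keep: Callable[[Node], bool]) -> List[Span]:
    """Return the spans of the nodes that the classifier keeps as constraints."""
    return [n.span for n in nodes if keep(n)]


def crosses(phrase: Span, constraint: Span) -> bool:
    """True if the spans overlap and neither contains the other."""
    (a, b), (c, d) = phrase, constraint
    overlap = a < d and c < b
    nested = (a <= c and d <= b) or (c <= a and b <= d)
    return overlap and not nested


def allowed(phrase: Span, constraints: List[Span]) -> bool:
    """A phrase is allowed if it crosses none of the active constraints."""
    return not any(crosses(phrase, c) for c in constraints)


def toy_keep(node: Node) -> bool:
    # Hypothetical stand-in for the trained classifier (assumption):
    # keep only S and PP nodes as reordering constraints.
    return node.label in {"S", "PP"}


# Toy parse of a six-word source sentence (illustrative data only).
nodes = [Node("S", (0, 6)), Node("NP", (0, 2)), Node("VP", (2, 6)), Node("PP", (3, 6))]

full = select_constraints(nodes, lambda n: True)   # every node is a constraint
filtered = select_constraints(nodes, toy_keep)     # classifier-selected subset

print(allowed((1, 3), full))      # False: the phrase crosses the NP and VP boundaries
print(allowed((1, 3), filtered))  # True: the NP and VP constraints were filtered out
print(allowed((2, 4), filtered))  # False: the phrase still crosses the kept PP node
```

In this toy setup, filtering relaxes the constraints that the stand-in classifier rejects while the kept ones still block crossing phrases; how the actual system scores nodes and enforces the constraints during decoding is not specified in the abstract.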
Similar resources
Preference Grammars and Soft Syntactic Constraints for GHKM Syntax-based Statistical Machine Translation
In this work, we investigate the effectiveness of two techniques for a feature-based integration of syntactic information into GHKM string-to-tree statistical machine translation (Galley et al., 2004): (1.) Preference grammars on the target language side promote syntactic well-formedness during decoding while also allowing for derivations that are not linguistically motivated (as in hierarchical ...
Rule Selection with Soft Syntactic Features for String-to-Tree Statistical Machine Translation
In syntax-based machine translation, rule selection is the task of choosing the correct target side of a translation rule among rules with the same source side. We define a discriminative rule selection model for systems that have syntactic annotation on the target language side (string-to-tree). This is a new and clean way to integrate soft source syntactic constraints into string-to-tree syste...
Extending CCG-based Syntactic Constraints in Hierarchical Phrase-Based SMT
In this paper, we describe two approaches to extending syntactic constraints in the Hierarchical Phrase-Based (HPB) Statistical Machine Translation (SMT) model using Combinatory Categorial Grammar (CCG). These extensions target the limitations of previous syntax-augmented HPB SMT systems which limit the coverage of the syntactic constraints applied. We present experiments on Arabic–English and ...
A Maximum Entropy Approach to Syntactic Translation Rule Filtering
In this paper we will present a maximum entropy filter for the translation rules of a statistical machine translation system based on tree transducers. This filter can be successfully used to reduce the number of translation rules by more than 70% without negatively affecting translation quality as measured by BLEU. For some filter configurations, translation quality is even improved. Our inves...
Preference Grammars: Softening Syntactic Constraints to Improve Statistical Machine Translation
We propose a novel probabilistic synchronous context-free grammar formalism for statistical machine translation, in which syntactic nonterminal labels are represented as “soft” preferences rather than as “hard” matching constraints. This formalism allows us to efficiently score unlabeled synchronous derivations without forgoing traditional syntactic constraints. Using this score as a feature i...
Publication date: 2010